A supervised learning method for classifying methylation disorders

Background: DNA methylation is one of the most stable and well-characterized epigenetic alterations in humans. Accordingly, it has already found clinical utility as a molecular biomarker in a variety of disease contexts. Existing methods for clinical diagnosis of methylation-related disorders focus on outlier detection in a small number of CpG sites, using standardized cutoffs that differentiate healthy from abnormal methylation levels. These standardized cutoff values do not take into account methylation patterns that are known to differ between the sexes and with age.

Results: Here we profile genome-wide DNA methylation from blood samples drawn from a cohort composed of healthy controls of different ages and sexes alongside patients with Prader–Willi syndrome (PWS), Beckwith–Wiedemann syndrome, Fragile-X syndrome, Angelman syndrome, and Silver–Russell syndrome (SRS). We propose a Generalized Additive Model to perform age- and sex-adjusted outlier analysis of around 700,000 CpG sites throughout the human genome. Utilizing z-scores among the cohort for each site, we deployed an ensemble-based machine learning pipeline and achieved a combined prediction accuracy of 0.96 (binomial 95% confidence interval 0.868–0.995).

Conclusion: We demonstrate a method for age- and sex-adjusted outlier detection of differentially methylated loci based on a large cohort of healthy individuals. We present a custom machine learning pipeline that uses this outlier analysis to classify samples for potential methylation-associated congenital disorders. Combined with machine learning methods, this outlier analysis achieves high accuracy in classifying abnormal methylation patterns.
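To make the idea of an age- and sex-adjusted z-score concrete, the following minimal sketch scores one probe against a healthy reference. It is not the model used in this study: it assumes a separate straight-line age trend per sex in place of the Generalized Additive Model, and the function names (`fit_reference`, `adjusted_z`) and all data values are hypothetical.

```python
from statistics import mean, stdev


def fit_line(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def fit_reference(controls):
    """Fit one linear age trend per sex from healthy controls (one probe).

    controls: iterable of (age_years, sex, beta) tuples.
    Returns {sex: (slope, intercept, residual_sd)}.
    """
    controls = list(controls)
    model = {}
    for sex in sorted({s for _, s, _ in controls}):
        ages = [a for a, s, _ in controls if s == sex]
        betas = [b for _, s, b in controls if s == sex]
        slope, intercept = fit_line(ages, betas)
        residuals = [b - (intercept + slope * a) for a, b in zip(ages, betas)]
        # Sample SD of the residuals: a rough spread estimate for this sketch.
        model[sex] = (slope, intercept, stdev(residuals))
    return model


def adjusted_z(model, age, sex, beta):
    """z-score of beta relative to the expected value for this age and sex."""
    slope, intercept, sd = model[sex]
    return (beta - (intercept + slope * age)) / sd


# Toy reference cohort (hypothetical values): males sit 0.10 above females.
ages = [10, 20, 30, 40, 50, 60]
noise = [0.01, -0.01, -0.01, 0.01, 0.01, -0.01]
controls = [(a, "F", 0.50 + 0.001 * a + e) for a, e in zip(ages, noise)]
controls += [(a, "M", 0.60 + 0.001 * a + e) for a, e in zip(ages, noise)]
model = fit_reference(controls)

z_typical = adjusted_z(model, age=35, sex="F", beta=0.535)
z_outlier = adjusted_z(model, age=35, sex="F", beta=0.635)
z_same_beta_male = adjusted_z(model, age=35, sex="M", beta=0.635)
print(z_typical, z_outlier, z_same_beta_male)
```

In this toy cohort the same beta value (0.635) is a strong outlier for a female sample but typical for a male sample, which is the situation a single standardized cutoff cannot represent.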

Table S1. List of relevant probes and disease associations.
Table S2. Cumulative variance explained by principal components calculated from the 109,131 above-background-noise probes among the 283 samples included in the study cohort.
Table S3. Probability of each sample in the test set being predicted as each class by the classifier trained in this study, after adjustment and calibration.

Supplementary Figures.
Fig. S1. Principal components of SRS. Global features represented by selected UMAP dimensions were not able to distinguish SRS patients from the rest of the cohort.

Fig. S2. Clusters. K-means clustering based on highly variable probes, or on 30 UMAP dimensions extracted from these probes, did not place SRS patients in a single cluster. The entire cohort was forced into 3 clusters based either on probes whose methylation level showed a deviation higher than 1.5 or on UMAP dimensions extracted from these highly variable probes. Clustering results for SRS patients are shown in the heatmap.

Fig. S3. Identification of the principal component that captures sex differences in methylation.

Fig. S4. Identification of the principal component that captures age differences in methylation.

Fig. S6. Clustering based on adjusted z-scores of probes known to be involved in the abnormalities included in this study.


Figure S3. Sex difference in methylation. Pair plots of 10 principal components calculated from normalized beta values of the 109,131 above-background-noise probes among the 283 samples included in the study cohort. Samples are color-coded by sex.

Figure S4. Age difference in methylation. Pair plots of 10 principal components calculated from normalized beta values of the 109,131 above-background-noise probes among the 283 samples included in the study cohort. Samples are color-coded by age group.

Fig. S5. Beta plots for select probes. Probe beta values are plotted against subject age in years for normal (apparently healthy) samples. Females are shown in green and males in red, with the mean line in the center and shading covering two standard deviations from the mean. Four probes are shown as examples of age and sex effects. For each probe, Welch's two-sample t-test is used to test for a significant difference between the male and female groups. Also shown is the Spearman correlation between beta value and age across the combined male and female samples, used to test for a statistically significant association with age. Probes in (a) and (b) show a relatively low age effect, although the age effect is statistically significant (p < 0.05) in (a). In (b), a strong sex effect can be observed. Probes in (c) and (d) appear in the list of probes published for Horvath's clock (Horvath 2013) as having strong age effects.
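The two per-probe statistics named in this caption can be sketched with the Python standard library alone. The block below computes the Welch t statistic with its Welch–Satterthwaite degrees of freedom and the Spearman rank correlation with average ranks for ties. The function names are illustrative, the inputs are toy values, and p-values are omitted because they require a t-distribution CDF that the standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def average_ranks(values):
    """1-based ranks; tied values share the average of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy beta values for one probe (hypothetical numbers).
female_betas = [0.41, 0.43, 0.40, 0.42, 0.44]
male_betas = [0.52, 0.50, 0.53, 0.51, 0.49]
ages = [5, 12, 20, 33, 47, 8, 15, 26, 39, 58]

t_stat, dof = welch_t(female_betas, male_betas)
rho = spearman_rho(ages, female_betas + male_betas)
print(t_stat, dof, rho)
```

Welch's test is the natural choice here because it does not assume that the male and female groups have equal variance.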

Figure S6. Clustering based on adjusted z-scores. Z-scores were calculated from the probe beta values of the samples and adjusted for age and sex using the model developed in this study. The samples were clustered using the k-means method.

Fig. S7. Scatteredness. The distribution of z-scores calculated from raw (a) or sex- and age-adjusted (b) methylation levels (beta) of target probes in male and female normal samples.

Fig. S8. Power analysis simulations exploring effect size and different probes. The age- and sex-adjusted GAMLSS model is shown in red, while the unadjusted global mean model is in blue. The simulation was performed for effect sizes of 0.1, 0.2, and 0.3 (left to right columns) for three different probes. Probes cg05816130, cg07158339, and cg08434396 were arbitrarily chosen by visual inspection of betas to represent weak, medium, and strong effects from age and/or sex (top to bottom rows). Probes with stronger age and/or sex effects show a greater difference between the power of the adjusted and unadjusted models, and very little difference when there is almost no age and/or sex effect for that probe.
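The kind of comparison shown in this figure can be illustrated with a small Monte Carlo sketch. It is not the simulation used in this study: it models only a sex effect, replaces the GAMLSS model with a per-sex mean and standard deviation, and assumes a detection rule of |z| > 3. The function name `simulate_power` and every parameter value other than the effect sizes 0.1, 0.2, and 0.3 are hypothetical.

```python
import random
from statistics import mean, stdev


def simulate_power(effect_size, sex_gap=0.2, noise_sd=0.03, n_controls=100,
                   n_trials=2000, z_cutoff=3.0, seed=0):
    """Estimate detection power of sex-adjusted vs. unadjusted z-scores.

    Returns (power_adjusted, power_unadjusted) for a simulated probe whose
    healthy mean beta differs between the sexes by sex_gap.
    """
    rng = random.Random(seed)
    base = {"F": 0.5 - sex_gap / 2, "M": 0.5 + sex_gap / 2}

    # Healthy reference cohort: half female, half male.
    controls = [(sex, rng.gauss(base[sex], noise_sd))
                for sex in ("F", "M") for _ in range(n_controls // 2)]

    # Unadjusted model: one global mean and SD for everyone.
    all_betas = [b for _, b in controls]
    global_mean, global_sd = mean(all_betas), stdev(all_betas)

    # Adjusted model (stand-in for the GAMLSS fit): mean and SD per sex.
    by_sex = {}
    for sex in base:
        betas = [b for s, b in controls if s == sex]
        by_sex[sex] = (mean(betas), stdev(betas))

    hits_adjusted = hits_unadjusted = 0
    for _ in range(n_trials):
        # Simulated affected sample: healthy mean shifted by the effect size.
        sex = rng.choice(("F", "M"))
        beta = rng.gauss(base[sex] + effect_size, noise_sd)
        mu, sd = by_sex[sex]
        hits_adjusted += abs(beta - mu) / sd > z_cutoff
        hits_unadjusted += abs(beta - global_mean) / global_sd > z_cutoff
    return hits_adjusted / n_trials, hits_unadjusted / n_trials


results = {es: simulate_power(es) for es in (0.1, 0.2, 0.3)}
for es, (adjusted, unadjusted) in results.items():
    print(f"effect size {es}: adjusted {adjusted:.2f}, unadjusted {unadjusted:.2f}")
```

With a large sex gap, the global standard deviation absorbs the between-sex difference, so the unadjusted z-score of a shifted sample shrinks and power drops; this matches the qualitative pattern the caption describes for probes with strong sex effects.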

Figure S9. AutoGluon 5-fold cross-validation accuracies. The validation accuracy of the models used for ensemble learning.
